Explora: Tackling Corpus Analysis with a Distributed Architecture

Author

  • Leonel Merino
Abstract

When analysing a corpus of software, researchers often ask questions that entail exploration and navigation, such as “what packages contain fat interfaces in open-source systems?”, “how consistently is the code being commented?” and “are naming conventions being followed?”. The answers to these questions can impact software maintainability and evolution. Software visualisation can help in understanding and exploring the answers to such questions, but corpus visualisations are time-consuming and difficult to achieve because they require large amounts of data to be processed. We tackle this constraint by using a distributed architecture. In this paper we propose an environment where researchers can build queries for their questions and then rapidly visualise the results. We elaborate on a proof-of-concept tool named Explora and report early results from visualising the Qualitas Corpus [4]. This paper uses colours in the figures. Please read a coloured printout of this paper for a better understanding.

Similar resources

Explora: Infrastructure for Scaling Up Software Visualisation to Corpora

Visualisation provides good support for software analysis. It copes with the intangible nature of software by providing concrete representations of it. By reducing the complexity of software, visualisations are especially useful when dealing with large amounts of code. One domain that usually deals with large amounts of source code data is empirical analysis. Although there are many tools for a...

EXPLoRA-web: linkage analysis of quantitative trait loci using bulk segregant analysis

Identification of genomic regions associated with a phenotype of interest is a fundamental step toward solving questions in biology and improving industrial research. Bulk segregant analysis (BSA) combined with high-throughput sequencing is a technique to efficiently identify these genomic regions associated with a trait of interest. However, distinguishing true from spuriously linked genomic r...

A Conversation Analysis of Ellipsis and Substitution in Global Business English Textbooks

Despite the body of research on textbook evaluation from the discourse analysis perspective, cohesive devices have rarely been analyzed in English for Specific Purposes (ESP) textbooks. The acquisition and use of cohesive devices is inherent to naturalistic communication, including business interactions. Hence, L2 learners of business English should be exposed to these devices through cohesion-...

Adequacy of the Endometrial Samples Obtained by the Uterine Explora Device and Conventional Dilatation and Curettage: A Comparative Study

Aims. Our aim is to compare the adequacy and diagnostic yield of samples obtained by the endometrial Explora Sampler I-MX120 with endometrial specimens obtained by conventional dilatation and curettage (D&C). Methods. A total of 1270 endometrial samples were received in the histopathology laboratories at the King Khalid University Hospital, Riyadh, Saudi Arabia, between 2007 and 2010. In the ou...

The EUDICO Project, Multi Media Annotation over the Internet

In this paper we describe a software environment that facilitates media annotation and analysis of media related corpora over the internet. We will describe the general architecture of this environment and we will introduce our Abstract Corpus Model with which we isolate corpora specific formats from the annotation and analysis tools. The main set of tools is described by giving examples of the...

Publication date: 2014